Abstract

We define a variant of k-of-n testing that we call conservative k-of-n testing. We present a polynomial-time,
combinatorial algorithm for the problem of maximizing throughput of conservative k-of-n testing
in a parallel setting. This extends previous work of Kodialam and Condon et al., who presented
combinatorial algorithms for parallel pipelined filter ordering, which is the special case where k = 1 (or
k = n) [7, 3, 4]. We also consider the problem of maximizing throughput for standard k-of-n testing, and
show how to obtain a polynomial-time algorithm based on the ellipsoid method using previous techniques.

1 Introduction

In standard k-of-n testing, there are n binary tests that can be applied to an "item" x. We use $x_i$ to denote
the value of the i-th test on x, and treat x as an element of $\{0,1\}^n$. With probability $p_i$, $x_i = 1$, and with
probability $1 - p_i$, $x_i = 0$. The tests are independent, and we are given $p_1, \ldots, p_n$. We need to determine
whether at least k of the n tests on x have a value of 0, by applying the tests sequentially to x. Once we
have enough information to determine whether this is the case, that is, once we have observed k tests with
value 0, or $n - k + 1$ tests with value 1, we do not need to perform further tests.

We define conservative k-of-n testing the same way, except that we continue performing tests until we 
have either observed k tests with value 0, or have performed all n tests. In particular, we do not stop testing 
when we have observed $n - k + 1$ tests with value 1.

There are many applications where k-of-n testing problems arise, including quality testing, medical 
diagnosis, and database query optimization. In quality testing, an item x manufactured by a factory is 
tested for defects. If it has at least k defects, it is discarded. In medical diagnosis, the item x is a patient;
patients are diagnosed with a particular disease if they fail at least k out of n special medical tests. A
database query may ask for all tuples x satisfying at least k of n given predicates (typically k = 1 or k = n).

For k = 1, standard and conservative k-of-n testing are the same. For k > 1, the conservative variant is 
relevant in a setting where, for items failing fewer than k tests, we need to know which tests they failed. For 
example, in quality testing, we may want to know which tests were failed by items failing fewer than k tests 
(i.e. those not discarded) in order to repair the associated defects. 

Our focus is on the MaxThroughput problem for k-of-n testing. Here the objective is to maximize the
throughput of a system for k-of-n testing in a parallel setting where each test is performed by a separate
"processor". In this problem, in addition to the probabilities $p_i$, there is a rate limit $r_i$ associated with the
processor that performs test i, indicating that the processor can only perform tests on $r_i$ items per unit time.

MaxThroughput problems are closely related to MinCost problems [8, 5]. In the MinCost problem
for k-of-n testing, in addition to the probabilities $p_i$, there is a cost $c_i$ associated with performing the i-th
test. The goal is to find a testing strategy (i.e. decision tree) that minimizes the expected cost of testing
an individual item. There are polynomial-time algorithms for solving the MinCost problem for standard
k-of-n testing [9, 10, 1, 2].

Kodialam was the first to study the MaxThroughput k-of-n testing problem, in the special case
where k = 1 [7]. He gave an $O(n^3 \log n)$ algorithm for the problem. The algorithm is combinatorial, but its
correctness proof relies on polymatroid theory. Later, Condon et al. studied the problem, calling it "parallel
pipelined filter ordering". They gave an $O(n^2)$ combinatorial algorithm, with a direct correctness proof [4].

In this paper, we extend the previous work by giving a polynomial-time combinatorial algorithm for 
the MaxThroughput problem for conservative k-of-n testing. Our algorithm can be implemented using
simple dynamic programming to run in time $O(kn^2)$, which is $O(n^2)$ for constant k. We leave for future
work the problem of reducing the runtime for non-constant k. 

The MaxThroughput problem for standard k-of-n testing appears to be fundamentally different from
its conservative variant. We leave as an open problem the task of developing a polynomial-time combinatorial
algorithm for this problem. We show that previous techniques can be used to obtain a polynomial-time 
algorithm based on the ellipsoid method. This approach could also be used to yield an algorithm, based on 
the ellipsoid method, for the conservative variant. 

2 Related Work 

Deshpande and Hellerstein studied the MaxThroughput problem for k = 1, when there are precedence 
constraints between tests [5]. They also showed a close relationship between the exact MinCost and 
MaxThroughput problems for k-of-n testing, when k = 1. Their results can be generalized to apply to
testing of other functions. 

Liu et al. [ ] presented a generic, linear-programming-based method for converting an approximation
algorithm for a MinCost problem into an approximation algorithm for a MaxThroughput problem.
Their results are not applicable to this paper, where we consider only exact algorithms. 

Polynomial-time algorithms for the MinCost problem for standard k-of-n testing were given by
Salloum, Breuer, Ben-Dov, and Chang et al. [9, 10, 1, 2, 11]. 

The problem of how to best order a sequence of tests, in a sequential setting, has been studied in many 
different contexts, and in many different models. See e.g. [8] and [4] for a discussion of related work on the 
filter-ordering problem (i.e. the MinCost problem for k = 1) and its variants, and [12] for a general survey 
of sequential testing of functions. 

3 Problem Definitions 

A k-of-n testing strategy for tests 1, ..., n is a binary decision tree T that computes the k-of-n function
$f : \{0,1\}^n \to \{0,1\}$, where f(x) = 1 if and only if x contains fewer than k 0's. Each node of T is labeled by
a variable $x_i$. The left child of a node labeled with $x_i$ is associated with $x_i = 0$ (i.e., failing test i), and the
right child with $x_i = 1$ (i.e., passing test i). Each $x \in \{0,1\}^n$ corresponds to a root-to-leaf path in the usual
way, and the label at the leaf is f(x).

A k-of-n testing strategy T is conservative if, for each root-to-leaf path leading to a leaf labeled 1, the
path contains exactly n non-leaf nodes, each labeled with a distinct variable $x_i$.

Given a permutation $\pi$ of the n tests, we define $T^c_k(\pi)$ to be the conservative strategy described by the
following procedure: Perform the tests in order of permutation $\pi$ until at least k 0's have been observed, or
all tests have been performed, whichever comes first. Output 0 in the first case, and 1 in the second.

Similarly, we define $T^s_k(\pi)$ to be the following standard k-of-n testing strategy: Perform the tests in order
of permutation $\pi$ until at least k 0's have been observed, or until $n - k + 1$ 1's have been observed, whichever
comes first. Output 0 in the first case, and 1 in the second.

Each test i has an associated probability $p_i$, where $0 < p_i < 1$. Let $D_p$ denote the product distribution on
$\{0,1\}^n$ defined by the $p_i$'s; that is, if x is drawn from $D_p$, then for all i, $\Pr[x_i = 1] = p_i$, and the $x_i$ are independent.
We use $x \sim D_p$ to denote a random x drawn from $D_p$. In what follows, when we use an expression of the
form Prob[...] involving an item x, we mean the probability with respect to $D_p$.

3.1 The MinCost problem 

In the MinCost problem for standard k-of-n testing, we are given n probabilities $p_i$ and costs $c_i > 0$, for
$i \in \{1, \ldots, n\}$, associated with the tests. The goal is to find a k-of-n testing strategy T that minimizes the
expected cost of applying T to a random item $x \sim D_p$. The cost of applying a testing strategy T to an item
x is the sum of the costs of the tests along the root-to-leaf path for x in T.

In the MinCost problem for conservative k-of-n testing, the goal is the same, except that we are restricted
to finding a conservative testing strategy. 

For example, consider the MinCost 2-of-3 problem with probabilities $p_1 = p_2 = 1/2$, $p_3 = 1/3$ and costs
$c_1 = 1$, $c_2 = c_3 = 2$. A standard testing strategy for this problem can be described procedurally as follows:
Given item x, begin by performing test 1. If $x_1 = 1$, follow strategy $T^s_2(\pi_1)$, where $\pi_1 = (2, 3)$. Else if $x_1 = 0$,
follow strategy $T^s_1(\pi_2)$, where $\pi_2 = (3, 2)$.

Under the above strategy, which can be shown to be optimal, evaluating x = (0, 0, 1) costs 5, and
evaluating x' = (1, 1, 0) costs 3. The expected cost of applying this strategy to a random item $x \sim D_p$ is $3\tfrac{5}{6}$.
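The figures in this example can be checked by enumerating all eight items. The sketch below is only a verification aid; the helper names (`standard_cost`, `example_cost`, `prob`) are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as F
from itertools import product

p = {1: F(1, 2), 2: F(1, 2), 3: F(1, 3)}  # Pr[x_i = 1]
c = {1: 1, 2: 2, 3: 2}                    # test costs


def standard_cost(x, order, k):
    """Cost of the standard k-of-m strategy applying the tests in `order`."""
    m, zeros, ones, cost = len(order), 0, 0, 0
    for i in order:
        cost += c[i]
        if x[i] == 0:
            zeros += 1
        else:
            ones += 1
        if zeros >= k or ones >= m - k + 1:
            break
    return cost


def example_cost(x):
    # Test 1 first; then 2-of-2 on (2, 3) if x_1 = 1, else 1-of-2 on (3, 2).
    if x[1] == 1:
        return c[1] + standard_cost(x, (2, 3), k=2)
    return c[1] + standard_cost(x, (3, 2), k=1)


def prob(x):
    """Probability of item x under the product distribution D_p."""
    result = F(1)
    for i in x:
        result *= p[i] if x[i] == 1 else 1 - p[i]
    return result


expected = F(0)
for bits in product((0, 1), repeat=3):
    x = dict(zip((1, 2, 3), bits))
    expected += prob(x) * example_cost(x)
```

Here `expected` evaluates to 23/6, i.e. $3\tfrac{5}{6}$, and the two sample items cost 5 and 3 respectively.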

Because the MinCost testing strategy may be a tree of exponential size in the number of tests, algorithms 
for the MinCost problem may output a compact representation of the output strategy. 

3.2 The MaxThroughput problem 

The MaxThroughput problem for k-of-n testing is a natural generalization of the MaxThroughput
problem for 1-of-n testing, first studied by Kodialam [7]. We give basic definitions and motivation here. For
further information, see [7, 3, 4].

In the MaxThroughput problem for k-of-n testing, as in the MinCost problem, we are given the
probabilities $p_1, \ldots, p_n$ associated with the tests. Instead of costs $c_i$ for the tests, we are given rate limits
$r_i > 0$. The MaxThroughput problem arises in the following context. There is an (effectively infinite)
stream of items x that need to be tested. Every item x must be assigned a strategy T that will determine 
which tests are performed on it. Different items may be assigned to different strategies. Each test is performed 
by a separate "processor", and the processors operate in parallel. (Imagine a factory testing setting.) Item 
x is sent from processor to processor for testing, according to its strategy T. Each processor can only test 
one item at a time. We view the problem of assigning items to strategies as a flow-routing problem. 

Processor $O_i$ performs test i. It has rate limit (capacity) $r_i$, indicating that it can only process $r_i$ items
x per unit time.

The goal is to determine how many items should be assigned to each strategy T, per unit time, in order 
to maximize the number of items that can be processed per unit time, the throughput of the system. The 
solution must respect the rate limits of the processors, in that the expected number of items that need to 
be tested by processor $O_i$ per unit time must not exceed $r_i$. We assume that tests behave according to
expectation: if m items are tested by processor $O_i$ per unit time, then $m p_i$ of them will have the value 1,
and $m(1 - p_i)$ will have the value 0.

Let $\mathcal{T}$ denote the set of all k-of-n testing strategies and $\mathcal{T}_c$ denote the set of all conservative k-of-n testing
strategies. Formally, the MaxThroughput problem for standard k-of-n testing is defined by the linear
program below. The linear program defining the MaxThroughput problem for conservative k-of-n testing
is obtained by simply replacing the set of k-of-n testing strategies $\mathcal{T}$ by the set of conservative k-of-n testing
strategies $\mathcal{T}_c$.

We refer to a feasible assignment to the variables $z_T$ in the LP below as a routing. We call constraints
of type (1) rate constraints. The value of F is the throughput of the routing. For $i \in \{1, \ldots, n\}$, if
$\sum_{T \in \mathcal{T}} g(T, i)\, z_T = r_i$, we say that the routing saturates processor $O_i$.

We will refer to the MaxThroughput problems for standard and conservative k-of-n testing as the "SMT(k)
problem" and the "CMT(k) problem", respectively.

As a simple example, consider the following CMT(k) problem (equivalently, SMT(k) problem) instance,
where k = 1 and n = 2: $r_1 = 1$, $r_2 = 2$, $p_1 = 1/2$, $p_2 = 1/4$. There are only two possible strategies,
$T_1(\pi_1)$, where $\pi_1 = (1, 2)$, and $T_1(\pi_2)$, where $\pi_2 = (2, 1)$. Since all flow assigned to $T_1(\pi_1)$ is tested by
$O_1$, $g(T_1(\pi_1), 1) = 1$; this flow continues on to $O_2$ only if it passes test 1, which happens with probability
$p_1 = 1/2$, so $g(T_1(\pi_1), 2) = 1/2$. Similarly, $g(T_1(\pi_2), 2) = 1$ while $g(T_1(\pi_2), 1) = 1/4$, since $p_2 = 1/4$.
Consider the routing that assigns $F_1 = 4/7$ units of flow to strategy $T_1(\pi_1)$, and $F_2 = 12/7$ units to strategy
$T_1(\pi_2)$. Then the amount of flow reaching $O_1$ is $4/7 \times g(T_1(\pi_1), 1) + 12/7 \times g(T_1(\pi_2), 1) = 1$, and the amount
of flow reaching $O_2$ is $4/7 \times g(T_1(\pi_1), 2) + 12/7 \times g(T_1(\pi_2), 2) = 2$. Since $r_1 = 1$ and $r_2 = 2$, this routing
saturates both processors. By the results of Condon et al. [4], it is optimal.
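For permutation strategies, the quantities g(T, i) and the resulting processor loads are easy to compute mechanically. The following is a minimal sketch, assuming a dictionary-based encoding of the instance; the function name `visit_probs` is hypothetical. It handles conservative strategies for any k, and reproduces the two-processor example above for k = 1.

```python
from fractions import Fraction as F


def visit_probs(perm, p, k):
    """g(T, i) for the conservative strategy that applies tests in order `perm`:
    test i is performed iff fewer than k earlier tests had value 0."""
    dist = [F(1)] + [F(0)] * (k - 1)  # dist[z] = Pr[z zeros so far], z < k
    g = {}
    for i in perm:
        g[i] = sum(dist)              # probability the item is still being tested
        nxt = [F(0)] * k
        for z, mass in enumerate(dist):
            nxt[z] += mass * p[i]                # test i passes
            if z + 1 < k:
                nxt[z + 1] += mass * (1 - p[i])  # test i fails, item survives
        dist = nxt
    return g


p = {1: F(1, 2), 2: F(1, 4)}
routing = {(1, 2): F(4, 7), (2, 1): F(12, 7)}  # flow per permutation strategy

load = {1: F(0), 2: F(0)}
for perm, flow in routing.items():
    g = visit_probs(perm, p, k=1)
    for i in load:
        load[i] += flow * g[i]
```

The resulting loads are 1 and 2, matching the rate limits $r_1 = 1$ and $r_2 = 2$.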



MaxThroughput LP:

Given $r_1, \ldots, r_n > 0$ and $p_1, \ldots, p_n \in (0, 1)$, find an assignment to the variables $z_T$, for all $T \in \mathcal{T}$, that
maximizes

$F = \sum_{T \in \mathcal{T}} z_T$

subject to the constraints:

(1) $\sum_{T \in \mathcal{T}} g(T, i)\, z_T \le r_i$ for all $i \in \{1, \ldots, n\}$, and

(2) $z_T \ge 0$ for all $T \in \mathcal{T}$,

where g(T, i) denotes the probability that test i will be performed on an item x that is tested using strategy
T, when $x \sim D_p$.



4 The Algorithm for the MinCost Problem 

In the literature, versions of the MinCost problem for 1-of-n testing are studied under a variety of different 
names, including pipelined filter ordering, selection ordering, and satisficing search (cf. [4]). 

The following is a well-known, simple algorithm for solving the MinCost problem for standard 1-of-n
testing (see e.g. [6]): First, sort the tests in increasing order of the ratio $c_i/(1 - p_i)$. Next, renumber the
tests, so that $c_1/(1 - p_1) \le c_2/(1 - p_2) \le \ldots \le c_n/(1 - p_n)$. Finally, output the sorted list $\pi = (1, \ldots, n)$ of
tests, which is a compact representation of the strategy $T^s_1(\pi)$ (which is the same as $T^c_1(\pi)$).
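The sorting rule is short enough to sketch directly. The code below is an illustration only: the function names and the small three-test instance are made up for this example, tests are indexed from 0, and the brute-force comparison covers only permutation strategies for 1-of-n testing.

```python
from fractions import Fraction as F
from itertools import permutations


def greedy_order(p, c):
    """Sort test indices by increasing ratio c_i / (1 - p_i)."""
    return sorted(range(len(p)), key=lambda i: c[i] / (1 - p[i]))


def expected_cost_1_of_n(order, p, c):
    """Expected cost when testing stops at the first test with value 0."""
    cost, reach = F(0), F(1)  # reach = Pr[all earlier tests had value 1]
    for i in order:
        cost += reach * c[i]
        reach *= p[i]
    return cost


# Hypothetical instance (not taken from the paper).
p = [F(1, 2), F(1, 4), F(3, 4)]
c = [F(3), F(1), F(2)]

best = min(permutations(range(3)), key=lambda o: expected_cost_1_of_n(o, p, c))
```

On this instance the ratios are 6, 4/3 and 8, so the greedy order is (1, 0, 2), which is also the permutation that the exhaustive search finds.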

The above algorithm can be applied to the MinCost problem for conservative k-of-n testing, simply
by treating $\pi$ as a compact representation of the conservative strategy $T^c_k(\pi)$. We now show that $T^c_k(\pi)$
is, in fact, a MinCost conservative strategy. The proof is a generalization of the proof of correctness for
1-of-n testing, and is slightly more complicated here in part because we need to consider the possibility 
that the optimal strategy corresponds to a complicated decision tree, rather than one specified by a single 
permutation. 

Lemma 1. The strategy $T^c_k(\pi)$ output by the above algorithm has minimum expected cost among all conservative
strategies.

Proof. Our proof is by induction on n. The base case, where n = 1, is trivially true for all k. For the inductive
step, suppose n > 1. Let $T^*$ be a conservative decision tree of minimum expected cost for the instance.
Assume for contradiction that strategy $T^c_k(\pi)$ does not have minimum expected cost. To simplify the proof,
we assume in what follows that the values $c_i/(1 - p_i)$ are distinct for all tests i. It is not difficult to eliminate
this assumption. Index the tests so that $c_1/(1 - p_1) < c_2/(1 - p_2) < \ldots < c_n/(1 - p_n)$.

Tree $T^*$ has a root labeled with a test i, and a left and right subtree $T^*_\ell$ and $T^*_r$. After seeing the value
of test i at the root of $T^*$, we still need to look for either k - 1 or k additional 0's. Thus $T^*_\ell$ and $T^*_r$ are
min-cost solutions for induced (k - 1)-of-(n - 1) and k-of-(n - 1) testing instances respectively.

Consider running the algorithm on both of these induced instances. The algorithm outputs the same
permutation $\pi = (1, 2, \ldots, i-1, i+1, \ldots, n)$ in both cases. By induction, $T^c_{k-1}(\pi)$ and $T^c_k(\pi)$ are optimal for
the induced instances. Let $T'$ be the tree produced from $T^*$ by replacing the optimal left and right subtrees
of $T^*$ by the optimal subtrees $T^c_{k-1}(\pi)$ and $T^c_k(\pi)$ respectively. Clearly, $T'$ is also an optimal tree for the
original problem instance.

Let $\pi' = (i, 1, \ldots, i-1, i+1, \ldots, n)$. Thus $T' = T^c_k(\pi')$. If i = 1, then $\pi = \pi'$, and because $T^c_k(\pi')$ is
optimal, so is $T^c_k(\pi)$. This contradicts our assumption that $T^c_k(\pi)$ is not optimal. Therefore, $i \neq 1$.

Now consider permutation $\pi' = (i, 1, 2, \ldots, i-1, i+1, \ldots, n)$. Move i forward in the permutation, one
spot at a time, until the resulting permutation $\pi''$ is such that $T^c_k(\pi'')$ is not optimal. Since $T^c_k(\pi)$ is not
optimal, either $\pi'' = (1, 2, \ldots, j-1, i, j, \ldots, i-1, i+1, \ldots, n)$ for some j < i, or $\pi'' = (1, 2, \ldots, i-1, i, \ldots, n) = \pi$
(in which case, let j = i).

By the definition of j, swapping j - 1 and i in $\pi''$ yields a permutation $\hat\pi$ whose associated strategy $T^c_k(\hat\pi)$
is optimal. We now compare the expected costs of $T^c_k(\pi'')$ and $T^c_k(\hat\pi)$ on an element $x \in \{0,1\}^n$. If the values
of tests 1, ..., j - 2 on input x include at least k 0's, or strictly fewer than k - 1 0's, then the costs of using
$T^c_k(\pi'')$ and $T^c_k(\hat\pi)$ on x are equal. If the values of tests 1, ..., j - 2 on input x include exactly k - 1 0's,
then the costs of $T^c_k(\pi'')$ and $T^c_k(\hat\pi)$ may differ. Both trees perform tests 1, ..., j - 2 first. If $x_{j-1} = 0$, then
$T^c_k(\pi'')$ will perform only that one additional test, while if $x_{j-1} = 1$, then $T^c_k(\pi'')$ will perform both tests
j - 1 and i. Similarly, if $x_i = 0$, then $T^c_k(\hat\pi)$ will perform only the one additional test i, while if $x_i = 1$, then
$T^c_k(\hat\pi)$ will perform both tests j - 1 and i. Any subsequent testing will be the same in both trees. Because the
expected cost of $T^c_k(\pi'')$ is greater than the expected cost of $T^c_k(\hat\pi)$, $c_{j-1} + p_{j-1} c_i > c_i + p_i c_{j-1}$, and thus

$c_{j-1}/(1 - p_{j-1}) > c_i/(1 - p_i)$.

Since $j \le i$, test j - 1 precedes test i in the sorted order, so this violates the sorted ordering of $\pi$, which is a contradiction. □
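The last algebraic step of the proof can be spelled out as follows. This is a worked restatement rather than new material; it writes $\hat\pi$ for the permutation obtained from $\pi''$ by swapping tests $j-1$ and $i$ (a notational assumption), and conditions on the event that tests $1, \ldots, j-2$ produced exactly $k-1$ zeros, which is the only case in which the two strategies differ.

```latex
\begin{align*}
\text{expected cost of tests } j-1, i \text{ under } \pi'' &= c_{j-1} + p_{j-1}\, c_i, \\
\text{expected cost of tests } j-1, i \text{ under } \hat\pi &= c_i + p_i\, c_{j-1}, \\[4pt]
c_{j-1} + p_{j-1}\, c_i > c_i + p_i\, c_{j-1}
  &\iff c_{j-1}\,(1 - p_i) > c_i\,(1 - p_{j-1}) \\
  &\iff \frac{c_{j-1}}{1 - p_{j-1}} > \frac{c_i}{1 - p_i}.
\end{align*}
```

The final equivalence divides both sides by $(1 - p_i)(1 - p_{j-1})$, which is positive because $0 < p_i < 1$ for every test.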

5 The Algorithm for the CMT(k) problem

We begin with some useful lemmas. The algorithms of Condon et al. for maximizing throughput of 1-of-n
testing rely crucially on the fact that saturation of all processors implies optimality. We show that the same 
holds for conservative k-of-n testing. 

Lemma 2. Let R be a routing for an instance of the conservative k-of-n testing problem. If R saturates all 
processors, then it is optimal. 

Proof. Each processor $O_i$ can test at most $r_i$ items per unit time. Thus at processor $O_i$, there are at most
$r_i(1 - p_i)$ tests performed that have the value 0. Let f denote the k-of-n function.

Suppose R is a routing achieving throughput F. Since F items enter the system per unit time, F items
must also leave the system. An item x such that f(x) = 0 does not leave the system until it fails k tests.
An item x such that f(x) = 1 does not leave the system until it has had all tests performed on it. Thus, per
unit time, in the entire system, the number of tests performed that have the value 0 must be $F \times M$, where

$M = k \cdot \mathrm{Prob}[x \text{ has at least } k \text{ 0's}] + \sum_{j=0}^{k-1} j \cdot \mathrm{Prob}[x \text{ has exactly } j \text{ 0's}]$.

Since at most $r_i(1 - p_i)$ tests with the value 0 can occur per unit time at processor $O_i$, $F \times M \le
\sum_{i=1}^{n} r_i(1 - p_i)$. Solving for F, this gives an upper bound of $F \le \sum_{i=1}^{n} r_i(1 - p_i)/M$ on the maximum
throughput. This bound is tight if all processors are saturated, and hence a routing saturating all processors
achieves the maximum throughput. □
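The upper bound from this proof is straightforward to evaluate numerically. The sketch below is an illustration with hypothetical function names; it computes M as the expected value of min(number of 0's, k), which is the same quantity as in the proof, and evaluates the bound on the two-processor instance from Section 3.2.

```python
from fractions import Fraction as F


def zero_count_distribution(p):
    """dist[j] = Pr[x has exactly j 0's] for independent tests, Pr[x_i = 1] = p[i]."""
    dist = [F(1)]
    for pi in p:
        nxt = [F(0)] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * pi            # test passes: zero count unchanged
            nxt[j + 1] += mass * (1 - pi)  # test fails: one more zero
        dist = nxt
    return dist


def throughput_upper_bound(p, r, k):
    """Bound sum_i r_i (1 - p_i) / M on the conservative k-of-n throughput."""
    dist = zero_count_distribution(p)
    M = sum(min(j, k) * mass for j, mass in enumerate(dist))
    return sum(ri * (1 - pi) for ri, pi in zip(r, p)) / M


# Instance from the earlier example: k = 1, r = (1, 2), p = (1/2, 1/4).
bound = throughput_upper_bound([F(1, 2), F(1, 4)], [F(1), F(2)], k=1)
```

The bound evaluates to 16/7, which equals the throughput 4/7 + 12/7 of the saturating routing given in that example.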

In the above proof, we rely on the fact that every routing with throughput F results in the same number 
of test values being generated in the system per unit time. Note that this is not the case for standard 
testing, where the number of test values generated can depend on the routing itself, and not just on the 
throughput of that routing. We now give a simple counterexample showing that, in fact, saturation does not 
imply optimality for the SMT(k) problem. Consider the MaxThroughput 2-of-3 testing instance where
$p_1 = 1/2$, $p_2 = 1/4$, $p_3 = 3/4$, and $r_1 = 2$, $r_2 = 1\tfrac{3}{4}$, $r_3 = 1\tfrac{3}{4}$.

The following is a 2-of-3 testing strategy: Given item x, perform test 1. If $x_1 = 1$, follow strategy $T^s_2(\pi_1)$,
where $\pi_1 = (2, 3)$. Else if $x_1 = 0$, follow strategy $T^s_1(\pi_2)$, where $\pi_2 = (3, 2)$.

Assigning 2 units of flow to this strategy saturates the processors: $O_1$ is saturated since it receives the 2
units entering the system, $O_2$ is saturated since it receives $1 = 2 \times p_1$ units from $O_1$ and $3/4 = 2 \times p_3 \times (1 - p_1)$
units that arrive via $O_1$ and then $O_3$. Similarly, $O_3$ is saturated since it receives $1 = 2 \times (1 - p_1)$ units from $O_1$ and $3/4 =
2 \times (1 - p_2) \times p_1$ units that arrive via $O_1$ and then $O_2$.

We show that the routing is not optimal by giving a different routing with higher throughput. The
routing uses two strategies. The first is as follows: Given item x, perform test 1. If $x_1 = 1$, follow strategy
$T^s_2(\pi_1)$, where $\pi_1 = (3, 2)$. Else, if $x_1 = 0$, follow strategy $T^s_1(\pi_2)$, where $\pi_2 = (2, 3)$. The second strategy
used by the routing is $T^s_2(\pi_3)$, where $\pi_3 = (3, 2, 1)$. Assigning $F = 1\tfrac{1}{2}$ units to the first strategy uses $1\tfrac{1}{2}$
units of the capacity of $O_1$, $15/16 = 1\tfrac{1}{2} \times (1 - p_1) + 1\tfrac{1}{2} \times p_1 \times (1 - p_3)$ units of the capacity of $O_2$, and
$15/16 = 1\tfrac{1}{2} \times p_1 + 1\tfrac{1}{2} \times (1 - p_1) \times p_2$ of the capacity of $O_3$. This leaves $O_2$ and $O_3$ with residual
capacity more than 3/4 (since $3/4 < 1 + 3/4 - 15/16$), and $O_1$ with residual capacity $1/2 = 2 - 1\tfrac{1}{2}$. We can then assign
3/4 additional units to the second strategy without violating any of the rate constraints, for a routing with
total throughput $2\tfrac{1}{4}$. (The resulting routing is not optimal, but illustrates our point.)
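The arithmetic in this counterexample can be verified by enumerating the eight possible items. The sketch below is a checking aid only; the helper names (`standard`, `branching`, `load`) are hypothetical, and strategies are encoded as functions returning the list of tests performed on an item.

```python
from fractions import Fraction as F
from itertools import product

p = {1: F(1, 2), 2: F(1, 4), 3: F(3, 4)}
r = {1: F(2), 2: F(7, 4), 3: F(7, 4)}


def standard(x, order, k):
    """Tests performed by the standard k-of-m strategy on the tests in `order`."""
    done, zeros, ones = [], 0, 0
    for i in order:
        done.append(i)
        if x[i] == 0:
            zeros += 1
        else:
            ones += 1
        if zeros >= k or ones >= len(order) - k + 1:
            break
    return done


def branching(after_pass, after_fail):
    """Test 1 first, then 2-of-2 on `after_pass` or 1-of-2 on `after_fail`."""
    def strategy(x):
        if x[1] == 1:
            return [1] + standard(x, after_pass, 2)
        return [1] + standard(x, after_fail, 1)
    return strategy


def load(strategy, flow):
    """Expected flow reaching each processor when `flow` units use `strategy`."""
    total = {i: F(0) for i in p}
    for bits in product((0, 1), repeat=3):
        x = dict(zip((1, 2, 3), bits))
        weight = flow
        for i in x:
            weight *= p[i] if x[i] == 1 else 1 - p[i]
        for i in strategy(x):
            total[i] += weight
    return total


saturating = load(branching((2, 3), (3, 2)), F(2))  # throughput 2
first = load(branching((3, 2), (2, 3)), F(3, 2))
second = load(lambda x: standard(x, (3, 2, 1), 2), F(3, 4))
combined = {i: first[i] + second[i] for i in p}     # throughput 3/2 + 3/4
```

The first routing loads every processor exactly to its rate limit, while the combined routing stays within all three limits at throughput 9/4.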

The routing produced by our algorithm for the CMT(k) problem uses only strategies of the form $T^c_k(\pi)$,
for some permutation $\pi$ of the tests (in terms of the LP, this means $z_T > 0$ only if $T = T^c_k(\pi)$ for some $\pi$).
We call such a routing a permutation routing. We say that a permutation routing R has a saturated suffix if for some subset Q of
the processors (1) R saturates all processors in Q, and (2) for every strategy $T^c_k(\pi)$ used by R, the processors
in Q appear together in a suffix of $\pi$.

With this definition, and the above lemma, we are now able to generalize a key lemma of Condon et 
al. to apply to conservative k-of-n testing. The proof is essentially the same as theirs; we present it below
for completeness. 

Lemma 3. (Saturated Suffix Lemma) Let R be a permutation routing for an instance of the CMT(k) problem.
If R has a saturated suffix, then R is an optimal solution for the instance.

Proof. If R saturates all processors, then the previous lemma guarantees its optimality. If not, let L denote 
the set of processors not saturated by R. Imagine that we removed the rate constraints for each processor 
in L. Let R' be an optimal routing for the resulting problem. We may assume that on any input x, R' 
performs the tests in L in some fixed arbitrary order (until and unless k tests with value 0 are obtained),
prior to performing any tests in Q. This assumption is without loss of generality, because if not, we could 
modify R' to first perform the tests in L without violating feasibility, since the processors in L have no rate 
constraints, and performing their tests first can only decrease the load on the other processors. Thus the 
throughput attained by $R'$ is $T_R \times \frac{1}{p_L}$, where $T_R$ denotes the maximum throughput achievable just with the
operators in Q, and $p_L$ is the probability that a random x will have the value 0 for fewer than k of the tests
in L (i.e. it will not be eliminated by the tests in L).

Routing R also routes flow first through L, and then through Q. Since it saturates the operators in Q, 
by the previous lemma, it achieves maximum possible throughput with those operators. It follows that R 
achieves the same throughput as R' , and hence is optimal for the modified instance where processors in L 
have no rate constraints. Since removing constraints can only increase the maximum possible throughput, 
it follows that R is also optimal for the original instance. □ 

5.1 The Equal Rates Case 

We begin by considering the CMT(k) problem in the special case where the rate limits $r_i$ are equal to
some constant value r for all processors. Condon et al. presented a closed-form solution for this case when
k = 1 [4]. The solution is a permutation routing that uses n strategies of the form $T_1(\pi)$. Each permutation
$\pi$ is one of the n left cyclic shifts of the permutation (1, ..., n). More specifically, for $i \in \{1, \ldots, n\}$, let
$\pi_i = (i, i+1, \ldots, n, 1, 2, \ldots, i-1)$, and let $T_i = T^c_k(\pi_i)$. The solution assigns $r(1 - p_{i-1})/(1 - p_1 \cdots p_n)$ units
of flow to each $T_i$, where $p_0$ is understood to mean $p_n$. By simple algebra, Condon et al. verified that the solution saturates all processors. Hence
it is optimal.

The solution of Condon et al. is based on the fact that for the 1-of-n problem, assigning $(1 - p_{i-1})$ flow
to each $T_i$ equalizes the load on the processors. Surprisingly, this same assignment equalizes the load for the
k-of-n problem as well. Using this fact, we obtain a closed-form solution to the CMT(k) problem.

Lemma 4. Consider an instance of the CMT(k) problem where all processors have the same rate limit r. For
$i \in \{1, \ldots, n\}$, let $T_i$ be as defined above. Let $X_{a,b} = \sum_{\ell=a}^{b} (1 - x_\ell)$. The routing that assigns $r(1 - p_{i-1})/\alpha$
items per unit time to strategy $T_i$ saturates all processors, where $\alpha = \sum_{t=1}^{k} \mathrm{Prob}[X_{1,n} \ge t]$.

Proof. We begin by considering the routing in which $(1 - p_{i-1})$ units of flow are assigned to each $T_i$. Consider
the question of how much flow arrives per unit time at processor 1, under this routing. For simplicity, assume
now that k = 2. Thus as soon as an item has failed 2 tests, it is discarded. Let $q_i = (1 - p_i)$.

Of the $q_n$ units assigned to strategy $T_1$, all $q_n$ arrive at processor 1. Of the $q_{n-1}$ units assigned to strategy
$T_n$, all $q_{n-1}$ arrive at processor 1, since they can fail either 0 or 1 test (namely test n) beforehand.

Of the $q_{n-2}$ units assigned to strategy $T_{n-1}$, the number reaching processor $O_1$ is $q_{n-2}\beta_{n-1}$, where
$\beta_{n-1}$ is the probability that an item fails either 0 or 1 of tests n - 1 and n. Therefore, $\beta_{n-1} = 1 -
q_{n-1}q_n$. More generally, for $i \in \{2, \ldots, n\}$, of the $q_{i-1}$ units assigned to $T_i$, the number reaching processor
1 is $q_{i-1}\beta_i$, where $\beta_i$ is the probability that a random item fails a total of 0 or 1 of tests i, i + 1, ..., n.

Thus, $\beta_i = \mathrm{Prob}[X_{i,n} = 0] + \mathrm{Prob}[X_{i,n} = 1]$. The same formula covers strategy $T_1$ if we index it as
i = n + 1, so that its flow is $q_n$ and $X_{n+1,n}$ is the empty sum, which equals 0. It follows that the total flow
arriving at processor 1 is
$\sum_{i=2}^{n+1} q_{i-1}\mathrm{Prob}[X_{i,n} = 0] + \sum_{i=2}^{n+1} q_{i-1}\mathrm{Prob}[X_{i,n} = 1]$.

Consider the second summation, $\sum_{i=2}^{n+1} q_{i-1}\mathrm{Prob}[X_{i,n} = 1]$. We claim that this summation is equal to
$\mathrm{Prob}[X_{1,n} \ge 2]$, which is the probability that x has at least two $x_i$'s that are 0. To see this, consider a
process where we observe the value of $x_n$, then the value of $x_{n-1}$, and so on down towards $x_1$, stopping if and
when we have observed exactly two 0's. The probability that we will stop at some point, having observed
two 0's, is clearly equal to the probability that x has at least two $x_i$'s that are set to 0. The condition
$\sum_{j=i}^{n}(1 - x_j) = 1$ is satisfied when exactly 1 of $x_n, x_{n-1}, \ldots, x_i$ has the value 0. Thus $q_{i-1}\mathrm{Prob}[X_{i,n} = 1]$
is the probability that we observe exactly one 0 among $x_n, \ldots, x_i$, and then we observe a second 0 at $x_{i-1}$. That
is, it is the probability that we stop after observing $x_{i-1}$. Since the second summation takes the sum of
$q_{i-1}\mathrm{Prob}[X_{i,n} = 1]$ over all i between 2 and n + 1 (the term for i = n + 1 is zero), the summation is precisely
equal to the probability of stopping at some point in the above process, having seen two 0's. This proves the claim.

An analogous argument shows that the first summation, $\sum_{i=2}^{n+1} q_{i-1}\mathrm{Prob}[X_{i,n} = 0]$, is equal to
$\mathrm{Prob}[X_{1,n} \ge 1]$.

It follows that the amount of flow reaching processor 1 is $\mathrm{Prob}[X_{1,n} \ge 1] + \mathrm{Prob}[X_{1,n} \ge 2]$. This
expression is symmetric in the processor numbers, so the amount of flow reaching every $O_i$ is equal to
this value. Thus the above routing causes all processors to receive the same amount of flow. Hence, if all
processors have the same rate limit r, scaling this routing by an appropriate multiplicative factor will yield a
routing that saturates all processors. More particularly, the routing that assigns $r q_{i-1}/(\mathrm{Prob}[X_{1,n} \ge 1] +
\mathrm{Prob}[X_{1,n} \ge 2])$ units to each strategy $T_i$ will saturate all processors if they have common rate limit r.

The above argument for k = 2 can easily be extended to arbitrary k. The resulting saturating routing
for arbitrary k, when all processors have rate limit r, assigns $r q_{i-1}/(\sum_{t=1}^{k} \mathrm{Prob}[X_{1,n} \ge t])$ items per unit
time to strategy $T_i$. □
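The equalizing property of the cyclic-shift routing can be checked numerically for small instances. The sketch below is an illustration under stated assumptions: tests are indexed from 0, the function names are hypothetical, and the instance is made up. It uses the identity that $\sum_{t=1}^{k} \mathrm{Prob}[X_{1,n} \ge t]$ equals the expected value of min(number of 0's, k).

```python
from fractions import Fraction as F


def visit_probs(perm, p, k):
    """Pr[test i is performed] under the conservative strategy with order `perm`."""
    dist = [F(1)] + [F(0)] * (k - 1)  # dist[z] = Pr[z zeros so far], z < k
    g = {}
    for i in perm:
        g[i] = sum(dist)
        nxt = [F(0)] * k
        for z, mass in enumerate(dist):
            nxt[z] += mass * p[i]
            if z + 1 < k:
                nxt[z + 1] += mass * (1 - p[i])
        dist = nxt
    return g


def alpha(p, k):
    """sum_{t=1..k} Prob[X_{1,n} >= t], computed as E[min(number of zeros, k)]."""
    dist = [F(1)]
    for pi in p:
        nxt = [F(0)] * (len(dist) + 1)
        for z, mass in enumerate(dist):
            nxt[z] += mass * pi
            nxt[z + 1] += mass * (1 - pi)
        dist = nxt
    return sum(min(z, k) * mass for z, mass in enumerate(dist))


def cyclic_shift_loads(p, r, k):
    """Load on each processor under the routing of Lemma 4 (0-based tests)."""
    n, a = len(p), alpha(p, k)
    loads = [F(0)] * n
    for i in range(n):
        perm = list(range(i, n)) + list(range(i))  # left cyclic shift starting at i
        flow = r * (1 - p[i - 1]) / a              # p[-1] plays the role of p_0 = p_n
        g = visit_probs(perm, p, k)
        for j in range(n):
            loads[j] += flow * g[j]
    return loads


loads = cyclic_shift_loads([F(1, 2), F(1, 4), F(3, 4)], r=F(5), k=2)
```

With a common rate limit of 5, every processor receives exactly 5 units of flow, as the lemma predicts.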

5.2 The Equalizing Algorithm of Condon et al. 

Our algorithm for the CMT(k) problem is an adaptation of one of the two MaxThroughput algorithms,
for the special case where k = 1, given by Condon et al. [4]. We begin by reviewing that algorithm, which 
we will call the Equalizing Algorithm. Note that when k = 1, it only makes sense to consider strategies that 
are permutation routings, since an item can be discarded as soon as it fails a single test. 

Consider the CMT(k) problem for k = 1. View the problem as one of constructing a flow of items
through the processors. The capacity of each processor is its rate limit, and the amount of flow sent along
a permutation $\pi$ (i.e., assigned to strategy $T_1(\pi)$) is equal to the number of items sent along that path per
unit time. Sort the tests by their rate limits, and renumber them so that $r_n \ge r_{n-1} \ge \ldots \ge r_1$. Assume for
the moment that all rate limits are distinct.

The Equalizing Algorithm constructs a flow incrementally as follows. Imagine pushing flow along the
single permutation (n, ..., 1). Suppose we continuously increase the amount of flow being pushed, beginning
from zero, while monitoring the "residual capacity" of each processor, i.e., the difference between its rate
limit and the amount of flow it is already receiving. (For the moment, let us not worry about exceeding the
rate limit of a processor.)

Consider two adjacent processors, i and i — 1. As we increase the amount of flow, the residual capacity 
of each decreases continuously. Initially, at zero flow, the residual capacity of i is greater than the residual 
capacity of i — 1. It follows by continuity that the residual capacity of i cannot become less than the residual 
capacity of i — 1 without the two residual capacities first becoming equal. We now impose the following 
stopping condition: increase the flow sent along permutation (n, . . . , 1) until either (1) some processor 
becomes saturated, or (2) the residual capacities of at least two of the processors become equal. The second 
stopping condition ensures that when the flow increase is halted, permutation (n, . . . , 1) still orders the 
processors in decreasing order of their residual capacities. (Algorithmically, we do not increase the flow 
continuously, but instead directly calculate the amount of flow which triggers the stopping condition.) 
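The direct calculation of the stopping point can be sketched as follows for the first phase (k = 1, distinct rate limits, no mega-processors yet). This is an illustration, not the implementation of Condon et al.; the function name `first_stop` and its return format are assumptions. It relies on the observation above that only adjacent processors in the order need to be compared.

```python
from fractions import Fraction as F


def first_stop(order, p, r):
    """Flow along `order` at which a processor saturates or two adjacent
    residual capacities meet (k = 1, rate limits strictly decreasing in `order`)."""
    use, reach = {}, F(1)
    for i in order:
        use[i] = reach  # fraction of the pushed flow that reaches processor i
        reach *= p[i]
    # Saturation of processor i happens at flow r_i / use_i.
    events = [(r[i] / use[i], "saturate", (i,)) for i in order]
    # Residuals r_a - f * use_a and r_b - f * use_b meet at f = (r_a - r_b) / (use_a - use_b).
    for a, b in zip(order, order[1:]):
        events.append(((r[a] - r[b]) / (use[a] - use[b]), "equalize", (a, b)))
    return min(events)


# Hypothetical usage on a three-processor instance.
p = {1: F(1, 8), 2: F(1, 2), 3: F(1, 3)}
r = {1: F(3), 2: F(14), 3: F(18)}
stop = first_stop((3, 2, 1), p, r)
```

For this instance the earliest event is at 6 units of flow, when the residual capacities of processors 3 and 2 become equal.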

If stopping condition (1) above holds when the flow increase is stopped, then the routing can be shown 
to have a saturated suffix, and hence it is optimal. 

If stopping condition (2) holds, we keep the current flow, and then augment it by solving a new MaxThroughput
problem in which we set the rate limits of the processors to be equal to their residual capacities
under the current flow (their $p_i$'s remain the same).

We solve the new MaxThroughput problem as follows. We group the processors into equivalence 
classes according to their rate limits. We then replace each equivalence class with a single mega-processor, 
with a rate limit equal to the rate limit of the constituent processors, and probability equal to the product
of their probabilities. We then essentially apply the procedure for the case of distinct rate limits to the
mega-processors.

The one twist is the way in which we translate flow sent through a mega-processor into flow sent through 
the constituent processors of that mega-processor; we route the flow through the constituent processors 
so as to equalize their load. We accomplish this by dividing the flow proportionally between the cyclic 
shifts of a permutation of the processors, using the same proportional allocation as used in the saturating 
routing of Lemma 4. We thus ensure that the processors in each equivalence class continue to have equal 
residual capacity. Note that, under this scheme, the residual capacity of a processor in a mega-processor 
may decrease more slowly than it would if all flow were sent directly to that processor (because some flow 
may first be filtered through other processors in the mega-processor) and this needs to be taken into account 
in determining when the stopping condition is reached. 

Here is an example illustrating the Equalizing Algorithm on the following 1-of-3 CMT(k) problem 
(which, since k = 1, is the same as the SMT(k) problem). Suppose we have 
3 processors, $O_1, O_2, O_3$, with rate limits $r_1 = 3$, $r_2 = 14$, and $r_3 = 18$, and probabilities $p_1 = 1/8$, $p_2 = 1/2$, 
and $p_3 = 1/3$. When flow is sent along $O_3, O_2, O_1$, a stopping condition is reached after 6 units of flow have been sent: 
$O_3$ and $O_2$ then have the same residual capacity of 12, and the residual capacity of $O_1$ is 2. 

Our algorithm then makes a recursive call in which the operators $O_3$ and $O_2$ are combined into a 
mega-processor $O_{2,3}$, which has $p_{2,3} = 1/2 \times 1/3 = 1/6$. Flow is sent through the mega-processor $O_{2,3}$ by 
sending a 3/7 fraction through $O_3, O_2$ and a 4/7 fraction through $O_2, O_3$; for one unit of flow 
sent through $O_{2,3}$, the amount of capacity used by each processor is $3/7 + 2/7 = 5/7$. Flow is now sent along 
$O_{2,3}, O_1$; after 12 units of flow, we reach a stopping condition when $O_1$ is saturated. Even though $O_2$ and $O_3$ 
are not saturated (they have $12 - 12 \times 5/7$ residual capacity left), the flows constructed as described provide 
optimal throughput. 
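
The arithmetic of this example can be replayed with exact fractions. The sketch below only checks the numbers quoted above (it takes the 3/7 and 4/7 split as given rather than deriving it), and the variable names are illustrative.

```python
from fractions import Fraction as F

p = {1: F(1, 8), 2: F(1, 2), 3: F(1, 3)}
residual = {1: F(3), 2: F(14), 3: F(18)}

# Phase 1: 6 units along (O3, O2, O1); an item reaches a processor only if it
# passed every earlier test.
t1 = F(6)
residual[3] -= t1
residual[2] -= t1 * p[3]
residual[1] -= t1 * p[3] * p[2]
after_phase1 = dict(residual)

# Phase 2: mega-processor O_{2,3}, split 3/7 along (O3, O2) and 4/7 along (O2, O3).
load3 = F(3, 7) + F(4, 7) * p[2]   # capacity of O3 used per unit of flow
load2 = F(4, 7) + F(3, 7) * p[3]   # capacity of O2 used per unit of flow
t2 = residual[1] / (p[2] * p[3])   # flow into O_{2,3} that saturates O1
residual[3] -= t2 * load3
residual[2] -= t2 * load2
residual[1] -= t2 * p[2] * p[3]

total_flow = t1 + t2               # flow routed over both phases
```

After the second phase $O_1$ is saturated while $O_2$ and $O_3$ each retain 24/7 units of capacity, and 18 units of flow have been routed in total.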

The Equalizing Algorithm, implemented in a straightforward way, produces a routing that may use an 
exponential number of different permutations. Condon et al. describe methods for reducing the number of 
permutations used [ ]. 

5.3 An Equalizing Algorithm for the CMT(k) problem 

We prove the following theorem. 

Theorem 5. There is an $O(kn^2)$ combinatorial algorithm for solving the CMT(k) problem. 

Proof. We extend the Equalizing Algorithm of Condon et al. to apply to arbitrary values of k. 

Again, we will push flow along the permutation of the processors $(n, \ldots, 1)$ (where $r_n > r_{n-1} > \ldots > r_1$) 
until one of the two stopping conditions is reached: (1) a processor is saturated, or (2) two processors have 
equal residual capacity. Here, however, we do not discard an item until it has failed k tests, rather than 
discarding it as soon as it fails one test. To reflect this, we divide the flow into k different types, numbered 
0 through k - 1, depending on how many tests its component items have failed. Flow entering the system 
is all of type 0. 

When m units of flow of type $\tau$ enter a processor $O_i$, $p_i m$ units pass test $i$, and $(1 - p_i)m$ units fail it. So, if 
$\tau < k - 1$, then of the m incoming units of type $\tau$, $(1 - p_i)m$ units will exit processor $O_i$ as type $\tau + 1$ flow, 
and $p_i m$ will exit as type $\tau$ flow. Both types will be passed on to the next processor in the permutation, if 
any. If $\tau = k - 1$, then $p_i m$ units will exit as type $\tau$ flow and be passed on to the next processor, and the 
remaining $(1 - p_i)m$ will be discarded. 
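
The bookkeeping for flow types at a single processor can be sketched as below. The function name `apply_test` and the list representation (entry $\tau$ holds the units of type $\tau$ flow) are assumptions made for illustration.

```python
from fractions import Fraction

def apply_test(flow, p_i):
    """Push typed flow through one processor with pass probability p_i.

    flow[tau] is the amount of flow whose items have failed tau tests so far;
    len(flow) is k. Returns (outgoing flow by type, amount discarded).
    """
    k = len(flow)
    out = [Fraction(0)] * k
    discarded = Fraction(0)
    for tau, m in enumerate(flow):
        out[tau] += p_i * m            # items that pass keep their type
        failed = (1 - p_i) * m
        if tau < k - 1:
            out[tau + 1] += failed     # one more failure, still in the system
        else:
            discarded += failed        # k-th failure: the item is discarded
    return out, discarded

# k = 2: six units of type 0 flow pass through two processors with p = 1/2.
half = Fraction(1, 2)
flow, lost_first = apply_test([Fraction(6), Fraction(0)], half)
flow, lost_second = apply_test(flow, half)
```

In this usage nothing is discarded at the first processor, and after the second one 3/2 units remain as type 0, 3 units as type 1, and 3/2 units have been discarded.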

Algorithmically, we need to calculate the minimum amount of flow that triggers a stopping condition. 
This computation is only slightly more complicated for general k than it is for k = 1. The key is to compute, 
for each processor $O_i$, what fraction of the flow that is pushed into the permutation will actually reach 
processor $O_i$ (i.e., we need to compute the quantity $g(T^c_k(\pi), i)$ in the LP). 

If stopping condition (2) holds, we keep the current flow, and augment it by solving a new MaxThroughput 
problem in which we set the rate limits of the processors to be equal to their residual capacities under 
the current flow (their $p_i$'s remain the same). To solve the new MaxThroughput problem, we again group 
the processors into equivalence classes according to their rate limits, and replace each equivalence class with a 
single mega-processor, with a rate limit equal to the rate limit of the constituent processors, and probability 
$p_i$ equal to the product of their probabilities. 

We then want to apply the procedure for the case of distinct rate limits to the mega-processors. To do 
this, we need to translate flow sent into a mega-processor into flow sent through the constituent processors 
of that mega-processor, so as to equalize their load. We do this translation separately for each type of 
flow entering the mega-processor. Flow of type $\tau$ entering the mega-processor is sent into the constituent 
processors of the mega-processor according to the closed-form solution for $(k - \tau)$-of-$n'$ testing, where $n'$ is 
the number of constituent processors of the mega-processor, because it will be discarded if it fails $k - \tau$ more 
tests. We also need to compute how much flow of each type ends up leaving the mega-processor (some of the 
incoming flow of type $\tau$ entering the mega-processor may, for example, become outgoing flow of type $\tau + n'$), 
and how much its residual capacity is reduced by the incoming flow. The algorithm can be implemented to 
run in time $O(kn^2)$. We give further details, with pseudocode, below. □ 

Pseudocode The pseudocode is presented below. The following information will be helpful in 
understanding it. 

At each stage of the algorithm, the processors are partitioned into equivalence classes. The 
processors in each equivalence class constitute a mega-processor. Each equivalence class consists of a contiguous 
subsequence of processors in the sorted sequence $O_n, \ldots, O_2, O_1$. We use m to denote the number of 
mega-processors (equivalence classes). The processors in each equivalence class all have the same residual capacity. 
In Step 1 of the algorithm, we partition the processors according to which have the same residual capacity. 
We use $O_i$ to denote the mega-processor containing the processors in equivalence class $E_i$. 

In Step 2, we compute the amount of flow t that triggers one of the two stopping conditions. In order to 
do this, we need to know the rate at which the residual capacity of each processor within an equivalence 
class $E_i$ will be reduced when flow is sent down the mega-processors in the order $E_m, \ldots, E_1$. We use $\xi(i)$ to 
denote the amount by which the residual capacity of the processors in $E_i$ is reduced when one unit of flow 
is sent in that order. 

The equation for $\xi(i)$ follows from the lemmas and proof for the algorithm. We use $f_j(z)$ to denote the 
amount of flow that would reach processor z, if one unit of flow were sent down the permutation $O_n, \ldots, O_1$, 
where these are the original processors, not the mega-processors. This is precisely equal to the probability 
that random item x has fewer than k 0's in tests $n, \ldots, z + 1$. We compute the value of $f_j(z)$ for all z and 
k in a separate initialization routine. The key here is noticing that if you send one unit of flow down the 
mega-processors $E_m, \ldots, E_1$, the amount of flow reaching mega-processor i is precisely $f_j(c(i))$, where $c(i)$ is 
the highest index of a processor in $E_i$; the amount of flow reaching the mega-processor depends only on how 
many 0's have been encountered in tests $n, \ldots, c(i) + 1$, not on the order used to perform those tests. 

The quantity $t_1$ is the amount of flow sent down $E_m, \ldots, E_1$ that would cause saturation of the processors 
in $E_1$. The quantity $t_2$ is the minimum amount of flow sent down $E_m, \ldots, E_1$ that would cause the residual 
capacities of two mega-processors to equalize. The stopping condition holds at the minimum of these two 
quantities. 

The algorithm outputs a routing as a list of pairs of the form $((E_m, \ldots, E_1), t)$, meaning that t units of flow 
should be sent down the permutation of mega-processors $(E_m, \ldots, E_1)$. Of course, flow coming into each 
mega-processor is routed so as to equalize the load on each of its constituent processors. 

It is easy to see that the algorithm makes at most n recursive calls, because mega-processors can only 
be merged a total of n - 1 times. Excluding the computation of $\xi$, the time spent in each recursive call 
is clearly $O(n)$. During each recursive call, the value of $\xi$ can be computed in time $O(nk)$ via dynamic 
programming. This yields a total running time of $O(n^2 k)$. 



Algorithm: MaxThroughput Initialization 

f_i(j) ← 0, ∀ j ∈ {1, ..., n}, ∀ i ∈ {0, ..., k - 1}; 
f_0(1) ← 1; 
for (j ← 2; j ≤ n; j ← j + 1) do 
    for (i ← 0; i ≤ k - 1; i ← i + 1) do 
        f_i(j) ← q_{j-1} f_{i-1}(j - 1) + p_{j-1} f_i(j - 1); 
return SolveMaxThroughput(p_1, ..., p_n, r_1, ..., r_n); 
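
A direct transliteration of the initialization recurrence into Python is sketched below. It is an illustration under assumptions, not the authors' code: the name `init_table` and the 1-based padding of `p` are choices made here, q is taken to be 1 - p, and the $f_{i-1}$ term is assumed to be 0 when i = 0. Read this way, `f[i][j]` is the probability that exactly i of the tests 1, ..., j - 1 have value 0; applying the table to flow sent along $(n, \ldots, 1)$ would require relabelling the processors, which is not shown.

```python
from fractions import Fraction

def init_table(p, k):
    """Fill f[i][j] for i in 0..k-1 and j in 1..n using
    f_i(j) = q_{j-1} f_{i-1}(j-1) + p_{j-1} f_i(j-1), with q = 1 - p.

    p is 1-based: p[0] is unused padding and p[j] belongs to test j.
    """
    n = len(p) - 1
    f = [[Fraction(0)] * (n + 1) for _ in range(k)]
    f[0][1] = Fraction(1)
    for j in range(2, n + 1):
        for i in range(k):
            # Assumption: the f_{i-1} term vanishes when i = 0.
            from_fail = (1 - p[j - 1]) * f[i - 1][j - 1] if i > 0 else Fraction(0)
            f[i][j] = from_fail + p[j - 1] * f[i][j - 1]
    return f

half = Fraction(1, 2)
table = init_table([None, half, half, half, Fraction(3, 4)], k=2)
reach_last = sum(table[i][4] for i in range(2))  # fewer than 2 zeros in tests 1..3
```

The double loop touches each of the nk table entries once, in line with the $O(nk)$ bound quoted for the dynamic program.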



Example We illustrate our algorithm for the CMT(k) problem on the following example, where k = 2 and 
n = 4. Suppose the probabilities are $p_1 = p_2 = p_3 = 1/2$, $p_4 = 3/4$, and the rate limits are $r_1 = r_2 = 12$, 
$r_3 = r_4 = 10$. 

We will use the following fact, which is an easy corollary of Lemma 4. The strategies $T_j$ are the cyclic 
permutation strategies defined in that lemma. It follows immediately from what is shown in the proof of 
the lemma, namely that assigning $(1 - p_{j-1})$ units of flow to each $T_j$ equalizes the load on each processor. 

Fact: Given processors $O_1, \ldots, O_n$, if R is a routing that assigns a $\frac{1 - p_{j-1}}{\sum_{i=1}^{n} (1 - p_i)}$ fraction of the total flow to 
strategy $T_j$ (with indices taken cyclically, so that $p_0 = p_n$), then R uses the same amount of capacity in each processor. 
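
The fact can be checked numerically on small instances. The sketch below assumes that the cyclic shift starting at processor j receives weight 1 minus the pass probability of the processor cyclically preceding j, an assumption chosen because it reproduces the fractions used in this paper's examples; `cyclic_allocation` and its 0-based indexing are illustrative.

```python
from fractions import Fraction as F

def cyclic_allocation(p, k):
    """p[i] is the pass probability of processor i (0-based).

    Returns (fractions, loads): fractions[j] is the share of flow sent along
    the cyclic shift (j, j+1, ..., j-1), and loads[i] is the capacity of
    processor i used per unit of total flow in conservative k-of-n testing
    (an item is dropped after its k-th failure).
    """
    n = len(p)
    # Assumed weights: shift j gets 1 - p of its cyclic predecessor.
    weights = [1 - p[j - 1] for j in range(n)]
    total = sum(weights)
    fractions = [w / total for w in weights]
    loads = [F(0)] * n
    for j in range(n):
        alive = [F(1)] + [F(0)] * (k - 1)   # alive[tau]: share with tau failures
        for step in range(n):
            i = (j + step) % n
            loads[i] += fractions[j] * sum(alive)
            nxt = [F(0)] * k
            for tau, m in enumerate(alive):
                nxt[tau] += p[i] * m
                if tau + 1 < k:
                    nxt[tau + 1] += (1 - p[i]) * m
            alive = nxt
    return fractions, loads

fractions4, loads4 = cyclic_allocation([F(1, 2), F(1, 2), F(1, 2), F(3, 4)], k=2)
```

For the four processors of the example with k = 2, the shares come out as 1/7, 2/7, 2/7, 2/7 and every processor carries a load of 6/7 per unit of flow.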

Our algorithm first combines processors with the same rate limits into mega-processors; thus we combine $O_1$ 
and $O_2$ into mega-processor $O_{1,2}$ with rate limit 12. It routes flow through this mega-processor by sending 
a 1/2 fraction of the flow in the order $O_1, O_2$, and sending the other 1/2 fraction in the order $O_2, O_1$. Similarly, 
$O_3$ and $O_4$ have the same rate limit, so they are combined into a mega-processor $O_{3,4}$ with rate limit 10, 
where a 1/3 fraction of the flow is sent along $O_3, O_4$, and a 2/3 fraction of the flow is sent along $O_4, O_3$. 

Our mega-processor $O_{1,2}$ has a higher rate limit than $O_{3,4}$; consequently, our algorithm routes flow in 
the order $O_{1,2}, O_{3,4}$. We now show that the stopping condition is reached after sending 6 units of flow along 
this route, since after the 6 units of flow have been sent we have equalized the residual capacity of all the 
operators without saturating any of them. 

The 6 units of flow decrease the capacity of processors $O_1$ and $O_2$ in $O_{1,2}$ by 6, since k = 2 and thus 
flow cannot be discarded before it has been subject to at least two tests. 

We now calculate the reduction of capacity in $O_3$ and $O_4$ caused by the 6 units of flow sent through 
$O_{1,2}, O_{3,4}$. Flow leaving $O_{1,2}$ has a 1/4 probability of having failed both processors in $O_{1,2}$ and exiting the 
system; for flow that stays in the system to be tested by $O_{3,4}$, it has a 1/4 chance of having passed the test 
of both processors, and a 1/2 chance of having passed the test of one processor and having failed the test 
of the other processor. Thus, of the 6 units of flow sent into $O_{1,2}$, $1/4 \times 6 = 3/2$ units are passed on to $O_{3,4}$ 
as type 0 flow, and $1/2 \times 6 = 3$ units of flow are passed on to $O_{3,4}$ as type 1 flow. 

SolveMaxThroughput(p_1, ..., p_n, r_1, ..., r_n) 

Input: n selectivities p_1, ..., p_n; n rate limits r_1 ≤ ... ≤ r_n 

Output: compact representation of solution to the MaxThroughput problem for the given input parameters 

1. // form the equivalence classes E_m, ..., E_1; 
    E_1 ← {O_1}; 
    m ← 1; // m is the number of equivalence classes 
    R_1 ← r_1; 
    for (ℓ ← 2; ℓ ≤ n; ℓ ← ℓ + 1) do 
        if (r_ℓ ≠ r_{ℓ-1}) then 
            m ← m + 1; 
            E_m ← {O_ℓ}; 
            R_m ← r_ℓ; 
        else // (r_ℓ == r_{ℓ-1}) 
            E_m ← E_m ∪ {O_ℓ}; 

2. // calculate t using the following steps; 
    for (i ← 1; i ≤ m; i ← i + 1) do 
        c(i) ← highest index of an operator in E_i; 
        b(i) ← lowest index of an operator in E_i; 
        ξ(i) ← reduction in the residual capacity of the processors in E_i per unit of flow sent down E_m, ..., E_1; 
    t_1 ← amount of flow sent down E_m, ..., E_1 that saturates the processors in E_1; 
    t_2 ← minimum amount of flow sent down E_m, ..., E_1 that equalizes the residual capacities of two mega-processors; 
    t ← min(t_1, t_2); 

3. // calculate the residual capacity for each operator O_ℓ; 
    for (ℓ ← 1; ℓ ≤ n; ℓ ← ℓ + 1) do 
        j ← index of the equivalence class E_j containing operator O_ℓ; 
        r'_ℓ ← r_ℓ - ξ(j) · t; 

4. // K = ((E_m, ..., E_1), t); 
    if (r'_1 == 0) then // residual capacity of equivalence class E_1 is 0 
        return K; 
    else 
        K' ← SolveMaxThroughput(p_1, ..., p_n, r'_1, ..., r'_n); 
        return K ∘ K'; // i.e. the concatenation of K and K' 

Of the 3/2 units of type 0 flow entering $O_{3,4}$, all of it must undergo both test 3 and test 4, since flow is 
not discarded until it has failed two tests. Thus that flow reduces the capacity of both $O_3$ and $O_4$ by 3/2 
units. 

Of the 3 units of type 1 flow entering $O_{3,4}$, 1/3 is tested first by $O_3$, and then by $O_4$ only if it passes test 3 
(which it does with probability 1/2). The remaining 2/3 is tested first by $O_4$, and then by $O_3$ only if it passes 
test 4 (which it does with probability 3/4). Thus of the 3 units of type 1 flow, $3 \times (1/3 + 2/3 \times 3/4) = 5/2$ 
units reach $O_3$, and $3 \times (2/3 + 1/3 \times 1/2) = 5/2$ units reach $O_4$. 

Hence the 3 + 3/2 total units of flow entering $O_{3,4}$ reduce the capacities of both $O_3$ and $O_4$ by 5/2 + 3/2 = 4. 

We have thus shown that the 6 units of flow sent first to $O_{1,2}$ and then to $O_{3,4}$ cause the residual 
capacities of $O_1$ and $O_2$ to be 12 - 6 = 6, and the residual capacities of $O_3$ and $O_4$ to be 10 - 4 = 6. Thus 
the residual capacities of all operators equalize, as claimed. 

At this point our algorithm constructs a new mega-processor, by combining the processors in $O_{1,2}$ with 
the processors in $O_{3,4}$. All the processors in the resulting mega-processor, $O_{1,2,3,4}$, have a residual capacity 
of 6. Using the proportional allocation in the routing of Lemma 4 to route flow into $O_{1,2,3,4}$, we will route 
a 1/7 fraction of the flow along $\pi_1 = (1, 2, 3, 4)$, a 2/7 fraction of the flow along the route $\pi_2 = (2, 3, 4, 1)$, a 2/7 
fraction of the flow along the route $\pi_3 = (3, 4, 1, 2)$, and a 2/7 fraction of the flow along the route $\pi_4 = (4, 1, 2, 3)$. 

By sending 7 units of flow through $O_{1,2,3,4}$ we decrease each processor's residual capacity by 6, thus 
saturating all processors. 

Our final routing achieves a throughput of 6 + 7 = 13 which is optimal. 
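
The numbers in this example can be replayed by simulating the two phases with exact fractions. The sketch checks the arithmetic only, not optimality. It makes one simplification: type 0 flow entering $O_{3,4}$ meets both tests whatever its order, so the same 1/3 to 2/3 split is applied to all flow inside $O_{3,4}$. The helper `send` is an illustrative name.

```python
from fractions import Fraction as F

K = 2
p = {1: F(1, 2), 2: F(1, 2), 3: F(1, 2), 4: F(3, 4)}
residual = {1: F(12), 2: F(12), 3: F(10), 4: F(10)}

def send(order, amount):
    """Send `amount` units along `order`, charging each processor for the flow
    that reaches it; flow is dropped once it has failed K tests."""
    alive = [amount] + [F(0)] * (K - 1)     # alive[tau]: units with tau failures
    for i in order:
        residual[i] -= sum(alive)
        nxt = [F(0)] * K
        for tau, m in enumerate(alive):
            nxt[tau] += p[i] * m
            if tau + 1 < K:
                nxt[tau + 1] += (1 - p[i]) * m
        alive = nxt

# Phase 1: 6 units through O_{1,2} (split 1/2 : 1/2), then O_{3,4} (split 1/3 : 2/3).
for head in [(1, 2), (2, 1)]:
    for tail, share in [((3, 4), F(1, 3)), ((4, 3), F(2, 3))]:
        send(head + tail, 6 * F(1, 2) * share)
after_phase1 = dict(residual)

# Phase 2: 7 units through O_{1,2,3,4}, split 1/7, 2/7, 2/7, 2/7 over the cyclic shifts.
for order, share in [((1, 2, 3, 4), F(1, 7)), ((2, 3, 4, 1), F(2, 7)),
                     ((3, 4, 1, 2), F(2, 7)), ((4, 1, 2, 3), F(2, 7))]:
    send(order, 7 * share)

throughput = 6 + 7
```

All four residual capacities equal 6 after the first phase and 0 after the second, matching the throughput of 13 stated above.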

6 An Ellipsoid-Based Algorithm for the SMT(k) problem 

There is a simple and elegant algorithm that solves the MinCost problem for standard k-of-n testing, due 
to Salloum, Breuer, and (independently) Ben-Dov [9, 10, 1]. It outputs a strategy compactly represented by 
two permutations, one ordering the operators in increasing order of the ratio $c_i/(1 - p_i)$, and the other in 
increasing order of the ratio $c_i/p_i$. Chang et al. and Salloum and Breuer later gave modified versions of this 
algorithm that output a less compact, but more efficiently evaluatable representation of the same strategy 
[11, 2]. 

We now show how to combine previous techniques to obtain a polynomial-time algorithm for the SMT(k) 
problem based on the ellipsoid method. The algorithm uses a technique of Deshpande and Hellerstein [5]. 
They showed that, for 1-of-n testing, an algorithm solving the MinCost problem can be combined with the 
ellipsoid method to yield an algorithm for the MaxThroughput problem. In fact, as we see in the proof 
below, their approach is actually a generic one, and can be applied to testing of other functions. 

The ellipsoid-based algorithm for k-of-n testing makes use of the dual of the LP for the CMT(k) problem, 
which is as follows: 

Dual of Max-Throughput LP: Given $r_1, \ldots, r_n > 0$, $p_1, \ldots, p_n \in (0, 1)$, find an assignment to the variables 
$y_i$, for all $i \in \{1, \ldots, n\}$, minimizing 

$$\sum_{i=1}^{n} r_i y_i$$ 

subject to the constraints: 

(1) $\sum_{i=1}^{n} g(T, i)\, y_i \ge 1$ for all $T \in \mathcal{T}_c$ and 
(2) $y_i \ge 0$ for all $i \in \{1, \ldots, n\}$. 



Theorem 6. There is a polynomial-time algorithm, based on the ellipsoid method, for solving the SMT(k) 
problem. 

Proof. The approach of Deshpande and Hellerstein works as follows. The input consists of the $p_i$ and the 
$r_i$, and the goal is to solve the MaxThroughput LP in time polynomial in n. The number of variables 
of the MaxThroughput LP is not polynomial, so the LP cannot be solved directly. Instead, the idea is 
to solve it by first using the ellipsoid method to solve the dual LP. The ellipsoid method is run using an 
algorithm that simulates a separation oracle for the dual in time polynomial in n. During the running of 
the ellipsoid method, the violated constraints returned by the separation oracle are saved in a set M. Each 
constraint of the dual corresponds to an ordering T. When the ellipsoid method terminates, a modified 
version of the MaxThroughput LP is generated, which includes only the variables $z_T$ corresponding to 
orderings T in M (i.e., the other variables $z_T$ are set to 0). This modified version can then be solved 
directly using a polynomial-time LP algorithm. The resulting solution is an optimal solution for the original 
MaxThroughput LP. 

The above approach requires a polynomial-time algorithm for simulating the separation oracle for the 
dual. Deshpande and Hellerstein's method for simulating the separation oracle relies on the following 
observations. In the dual LP for the MaxThroughput 1-of-n testing problem, there are $n!$ constraints 
corresponding to the $n!$ permutations of the processors. The constraint for permutation $\pi$ is $\sum_{i=1}^{n} g(T^c_1(\pi), i)\, y_i \ge 1$. If 
one views y as a vector of costs, where the cost of i is $y_i$, then $\sum_{i=1}^{n} g(T, i)\, y_i$ is the expected cost of testing 
an item x using ordering T. Thus one can determine the ordering T that minimizes $\sum_{i=1}^{n} g(T, i)\, y_i$ by solving 
the MinCost problem with probabilities $p_1, \ldots, p_n$ and cost vector y. (Liu et al.'s approximation algorithm 
for generic MaxThroughput also relies on that observation [8].) 

If the MinCost ordering T has expected cost less than 1, then the constraint it corresponds to is violated. 
Otherwise, since the right hand side of each constraint is 1, y obeys all constraints. Thus simulating the 
separation oracle for the dual on input y can be done by first running the MinCost algorithm (with 
probabilities $p_i$ and costs $y_i$) to find a MinCost ordering T. Once T is found, the values of the coefficients 
$g(T, i)$ are calculated. These are used to calculate $\sum_{i=1}^{n} g(T, i)\, y_i$, the expected cost of T. If this value is less 
than 1, then the constraint $\sum_{i=1}^{n} g(T, i)\, y_i \ge 1$ is returned. 
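
For the 1-of-n case, the simulated separation oracle can be sketched in a few lines. This is an illustration under assumptions rather than the procedure of Deshpande and Hellerstein: it uses the classical MinCost rule for 1-of-n testing (sort by increasing $y_i/(1 - p_i)$), and the names `expected_cost` and `separation_oracle` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def expected_cost(order, p, y):
    """Sum over i of g(T, i) * y[i] for the ordering T = order in 1-of-n testing:
    test i is reached only if every earlier test in the ordering had value 1."""
    cost, reach = Fraction(0), Fraction(1)
    for i in order:
        cost += reach * y[i]
        reach *= p[i]
    return cost

def separation_oracle(p, y):
    """Return an ordering whose dual constraint is violated by y, or None."""
    # Assumed MinCost rule for 1-of-n testing: increasing y_i / (1 - p_i).
    order = sorted(range(len(p)), key=lambda i: y[i] / (1 - p[i]))
    return order if expected_cost(order, p, y) < 1 else None

p = [Fraction(1, 8), Fraction(1, 2), Fraction(1, 3)]
y_small = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]
y_large = [Fraction(1), Fraction(1), Fraction(1)]
violated = separation_oracle(p, y_small)   # an ordering with expected cost below 1
feasible = separation_oracle(p, y_large)   # None: every constraint holds

# Brute-force check over all n! orderings that the sorted ordering is a MinCost ordering.
best = min(expected_cost(o, p, y_small) for o in permutations(range(3)))
```

Only the sort and one pass over the ordering are needed per oracle call, which is what makes the oracle polynomial even though the dual has $n!$ constraints.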

To apply the above approach to MaxThroughput for standard k-of-n testing, we observe that in the 
dual LP for this problem, there is a constraint, $\sum_{i=1}^{n} g(T, i)\, y_i \ge 1$, for every possible strategy T. We can 
simulate a separation oracle for the dual on input y by running a MinCost algorithm for standard k-of-n 
testing. We also need to be able to compute the $g(T, i)$ values for the strategy output by that algorithm. 
The algorithm of Chang et al. for the MinCost standard k-of-n testing problem is suitable for this purpose, as it 
can easily be modified to output the $g(T, i)$ values associated with its output strategy T [2]. □ 